Parallel and Distributed Compressed Indexes

Authors

  • Luís M. S. Russo
  • Gonzalo Navarro
  • Arlindo L. Oliveira
Abstract

We study parallel and distributed compressed indexes. Compressed indexes are a new and functional way to index text strings. They exploit the compressibility of the text, so that their size is a function of the compressed text size. Moreover, they support a considerable number of functions, more than many classical indexes. We make use of this extended functionality to obtain, on a shared-memory parallel machine, near-optimal speedups for solving several stringology problems. We also show how to distribute compressed indexes across several machines.
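
As a rough illustration of why index queries parallelize well on shared memory, the sketch below counts pattern occurrences for a batch of patterns against one read-only index. It is not the authors' method: a plain (uncompressed) suffix array stands in for a compressed index, and the function names `build_suffix_array`, `count_occurrences`, and `parallel_count` are hypothetical. In CPython the thread pool mainly shows the structure (independent queries over shared, immutable data) and is not expected to yield real speedups.

```python
from concurrent.futures import ThreadPoolExecutor


def build_suffix_array(text):
    # Naive construction by sorting all suffixes; acceptable only for toy inputs.
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, sa, pattern):
    # Two binary searches delimit the block of suffixes that start with `pattern`.
    m = len(pattern)

    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return lo - first


def parallel_count(text, sa, patterns, workers=4):
    # Each query only reads `text` and `sa`, so queries need no locking
    # and can be handed to separate workers. Results keep the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: count_occurrences(text, sa, p), patterns))


if __name__ == "__main__":
    text = "abracadabra"
    sa = build_suffix_array(text)
    print(parallel_count(text, sa, ["abra", "a", "bra", "cad", "xyz"]))
    # prints [2, 5, 2, 1, 0]
```

The same decomposition (one shared index, many independent queries) is only one way to use parallelism; the abstract refers to speedups on individual stringology problems, which this sketch does not attempt to reproduce.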

Similar resources

Massive-Scale RDF Processing Using Compressed Bitmap Indexes

The Resource Description Framework (RDF) is a popular data model for representing linked data sets arising from the web, as well as large scientific data repositories such as UniProt. RDF data intrinsically represents a labeled and directed multigraph. SPARQL is a query language for RDF that expresses subgraph pattern-finding queries on this implicit multigraph in a SQL-like syntax. SPARQL quer...

HCB-Tree: A Height Compressed B-Tree for Parallel Processing

B-tree type indexes are popular in database applications because they provide a fast access path to large databases. In this paper we present a new storage structure which is suitable for fast parallel searching by using B-tree-like indexes [I]. We call this modified B-tree structure the Height Compressed B-tree (HCB-tree). The main results presented in this paper are that parallel processing...

Indexes and Computation over Compressed Structured Data (Dagstuhl Seminar 13232)

  • Belief Change and Argumentation in Multi-Agent Scenarios (Dagstuhl Seminar 13231): Jürgen Dix, Sven Ove Hansson, Gabriele Kern-Isberner, and Guillermo Simari
  • Indexes and Computation over Compressed Structured Data (Dagstuhl Seminar 13232): Sebastian Maneth and Gonzalo Navarro
  • Virtual Realities (Dagstuhl S...

Better bitmap performance with Roaring bitmaps

Bitmap indexes are commonly used in databases and search engines. By exploiting bit-level parallelism, they can significantly accelerate queries. However, they can use much memory, and thus we might prefer compressed bitmap indexes. Following Oracle’s lead, bitmaps are often compressed using run-length encoding (RLE). Building on prior work, we introduce the Roaring compressed bitmap format: it...

Pattern Kits

Compressed full-text indexes have been one of pattern matching’s most important success stories of the past decade. We can now store a text in nearly the information-theoretic minimum of space, such that we can still quickly count and locate occurrences of any given pattern. However, some files or collections of files are so huge that, even compressed, they do not all fit in one machine’s inter...

Publication date: 2010